ABSTRACT

A study was conducted to further validate the
"Watson-Barker Listening Test." The subjects, 120 students enrolled
in basic speech courses, completed the Receiver Apprehension Test
(RAT) and the Watson-Barker Listening Test: Form A. Statistical
analysis of the results revealed a significant correlation between
the RAT scores and both the long term memory and the total listening
measures on the Watson-Barker test, but not between the RAT scores
and the short term memory measure. The results only partially
supported claims of validity for the Watson-Barker instrument.
(HTH)
ABSTRACT

Preliminary Research Employing the Watson-Barker Listening Test:

A Validation of the Instrument



The face validity of the Watson-Barker Listening Test previously has been
established through inspection by listening theorists. This study sought additional
support for these claims of validity. One hundred twenty students enrolled in
basic speech courses were asked to complete the Receiver Apprehension Test (RAT)
and take the Watson-Barker Listening Test: Form A. Statistical analysis of the
data revealed a significant correlation between RAT scores and both Long Term
Memory and Total Listening, but not between RAT scores and Short Term Memory.
The significant relationships were curvilinear in nature, as expected, based on the
relevant literature. It was concluded that the claims of validity for the
Watson-Barker Instrument are partially supported by this data.
PRELIMINARY RESEARCH EMPLOYING THE WATSON-BARKER
LISTENING TEST: A VALIDATION OF THE INSTRUMENT

The Watson-Barker Listening Test was developed in 1982 in an attempt
to create a standardized listening test that would be oriented primarily
toward adults and mature college level audiences (Watson and Barker,
1984). While a number of reliability analyses were conducted and
acceptable levels of reliability were established, the only measure of
validity undertaken was that of "face validity" (Watson and Barker, 1984,
p. 1). Given the diverse definitions of "listening" held by various
listening experts, such support is not totally reassuring. Currently
experiments are being conducted in an attempt to link test results of the
Watson-Barker instrument with those of other listening tests such as the
Kentucky Comprehensive Listening Test. While such experiments will
help to establish the efficacy of comparing data of the various tests, they
provide only a tautological validation of the instruments. If all tests are
highly correlated and if any one test is valid, then the validity claims of
all tests can be accepted. If no check of validity other than that of "face
validity" is performed, all such claims should be held in abeyance until
the concept of "listening" is agreed upon substantively by listening
theorists.

The problems of establishing the validity of listening tests are
monumental. There is quite a bit of disagreement concerning which
various subprocesses should be included within the conceptualization of
listening. Is listening a combination of "hearing, understanding, and
retaining" information, or should other subprocesses be included or some
of these be excluded (Bostrom, 1983)? Regardless of the various
conceptualizations, it appears clear from the nature of the instruments
being used to measure "listening ability" that the one subprocess that is
central to the measurement of listening is the "recall" or "recognition" of
retained information. All tests share a common method. Subjects are
asked to listen to a message, or set of stimuli, and then are asked to recall
or recognize various parts of that message or set of stimuli either
immediately after hearing the test passages or at some delayed time
thereafter. While the nature of the test passage varies from instrument
to instrument, this procedure seems invariant.

Another constant appears to be the effort on the part of the designers
to hold "listening motivation" constant for all subjects. All of the major
tests of listening ability are administered in such a manner that all
subjects are aware that their listening is to be tested. Kelly (1967) points
out the problems of external validity using this procedure when he notes,

We have a massive body of information about the listening
behavior of subjects who knew they were going to be tested . . .
but we have done almost nothing to find out about performances
across the general range of situations from panic to
boredom (p. 464).

This is crucial to the external validity of listening tests when one
considers that one of the most consistent findings in listening research
has been that the recall of material is facilitated by increases in extrinsic
motivational cues. Forewarning of a test has been found to be such a cue.

Knowledge that a test will follow a listening experience has been
labeled "anticipatory set." Anticipatory set creates the real possibility
that a "ceiling effect" may be established. Procedures that are common in
listening measurement severely limit the free functioning of any
antecedent listening ability, as would be manifested in a
"non-laboratory" situation. This phenomenon has been reported by many
researchers (see, for example, Anastasis, 1961; Kelly, 1962, 1963, 1967).
Cronen and Mihevc (1972) discuss how subjects under "aware" conditions
actively listen to messages so that they might answer questions
concerning the material at a later time. The effect of forewarning is to
raise the motivational forces naturally at work in the typical listener as
high as his mental ability will allow and to disallow the differential
functioning of other pertinent variables upon the comprehension and
retention of material (Kelly, 1967). This may well be the reason that
correlations between measures of mental ability and intelligence, and
such listening tests as the Brown-Carlsen Listening Comprehension Test
and the STEP have been so high (Keller, 1960; Petrie, 1961; Anderson and
Baldauf, 1963).

Listening test designers should not be uninterested in studying the
listening behavior of subjects under these conditions. Many classroom
teachers hope that these conditions exist for them in their various
courses. However, even a cursory inspection of the most ideal classroom
will reveal that students are not motivated to listen, day in and day out, to
the information presented them. Many students seem to be content to
remember information only so long as it takes to place that information
in their notes. In any case, conditions where testing is imminent are not
likely to be found in most other situations.

Of particular interest then is the extent to which scores obtained in
controlled conditions of standardized motivation reflect the listening
ability of subjects when they venture outside the laboratory
environment. Resolving this question of external validity is not an easy
task, given the nature of the listening instruments extant today. While
the Watson-Barker test does contain stimuli that are capable of being
generated in a non-laboratory setting, the task of getting even one
subject to respond to questions that would mirror the content of the test
under conditions of "nonawareness of the intent to test" is too huge to
seriously consider.

Another method is available for establishing the external validity of
listening tests. Groups of "good listeners" and groups of "poor listeners"
could be given listening tests. In this manner the validity of the
instrument could be established. However, before such procedures could
be completed, the aforementioned definitional debate as to what
constitutes "good listening" would have to be settled.

Bostrom (1984) argues that one method of establishing validity is to
illustrate that the instrument in question measures a unique
characteristic. He compares a wide variety of tests with his listening
instrument to illustrate its uniqueness. While his data is compelling
evidence that his instrument measures a unique construct, he presents no
evidence that his instrument measures "listening ability." To say that
something is not several other things is not the same as saying it is what
he says it is. He continues his quest for validity by illustrating that
certain groups respond differently than others. Specifically he
indicates that college students, army officers, and high school students
have different performance levels. Knowing several members of each
subject set, I suggest that none of the sets can boast of a uniform level of
listening ability. This is not to say that his instrument does not measure
listening ability. Rather it is to suggest that he has not substantiated his
case for the validity of his instrument in the eyes of this observer.

At least one other method for severing this tautological Gordian knot
was suggested by the efforts of Bostrom (1984). While uniqueness is one
characteristic of validity, shared commonality, as evidenced by
significant correlations, with valid measures of a phenomenon is
acceptable support of a contention of validity. There are tests of
established validity that are conceptualized to measure certain aspects of
the listening domain. One such instrument is the Receiver Apprehension
Test (Wheeless, 1975). This instrument measures the self-reported
anxiety of subjects that is associated with listening to stimuli generated
in a variety of situations. It has been studied in terms of its relationship
to other self-report measures (Beatty, 1981; Beatty and Payne, 1981) and
its psychometric properties (Beatty, in press). Of particular note is the
established correlation of RAT scores and physiological arousal (Roberts,
1980, 1984). This becomes even more important when the correlation
between arousal and retention is entered into the equation. A number of
researchers have established a link between retention and arousal
(Kleinsmith and Kaplan, 1963; Crane et al., 1971; Roberts, 1980). The
relationship between arousal and retention is posited to be curvilinear in
nature, while the relationship between physiological arousal and RAT
scores is linear. Since listening ability is said to reflect short term
retention and long term retention ability, in part, then there should be a
correlation between RAT scores and scores on valid listening tests. This
relationship would be curvilinear in nature. Too much or too little
physiological arousal, as indicated by RAT scores, would result in poorer
retention scores, as reflected by scores on a listening test. Optimum
levels of arousal would result in higher retention scores.
Thus, the following hypothesis was conceived:

There is a curvilinear relationship between receiver
apprehension, as measured by the RAT, and listening ability, as
measured by the Watson-Barker Listening Test.

METHOD

SUBJECTS: Subjects were 127 volunteer undergraduate students, 42 males
and 85 females, enrolled in beginning speech communication courses at
McNeese State University during the Spring semester of 1983. Data of
seven of the subjects was subsequently discarded for several reasons.
Three of the subjects were from other countries and their grasp of the
English language prohibited an accurate test of their listening ability.
Four other subjects did not complete one or both of the instruments
utilized in this experiment.

PROCEDURE: At the beginning of the Spring semester, students in six
sections of a basic speech communications class were asked to volunteer
for an experiment. The purpose of the experiment was explained to them
in detail and the procedures that would be followed were outlined. They
were assured that the tests would have no impact on their grade, nor
would their decision to participate or not participate affect their standing
in the class. With only one exception, all students agreed to participate.
The one non-volunteer was excused from the next class meeting.

At the next class meeting the subjects were asked to complete the
Receiver Apprehension Test (Wheeless, 1975). After collecting the RAT,
subjects were asked to complete the Watson-Barker Listening Test, Form A
(Watson and Barker, 1984). This test requires students to listen to a
twenty minute audio tape and answer questions based on the information
presented on the tape. There are five different types of listening tasks
asked of the subjects. Each section of the test is comprised of ten
questions. Three of the sections are said to test "short term memory
skills" and the remaining two sections are purported to assess "long term
memory skills" (Watson and Barker, 1984). The test tape begins with a
short passage that allows the experimenter to ensure that all subjects can
hear the tape adequately. After adjusting the volume control of the tape
player, the tape was played for the subjects, pausing only briefly to allow
subjects to turn the pages of their test booklets when required.
Although these pauses were not called for in the instructions provided
with the test, they were deemed necessary because of the potential for
distortion that the extraneous noise presented. The actual test time
required varied slightly from class to class (the average time required for
completing the Watson-Barker Listening Test was approximately 30
minutes). After the subjects had completed the test, their answer sheets
were collected, they were asked to refrain from discussing the tests with
others who might subsequently participate in the experiment, and were
assured that their test answers would be evaluated, shared and explained
to them at the next regular meeting of the class.

RESULTS 

The completed tests were scored according to directions provided by the
designers of the two instruments. As indicated above, four of the subjects
failed to complete one or both of the tests and the tests of three other
subjects were discarded because it was evident that they did not
understand English well enough to have their listening ability
effectively measured by the Watson-Barker instrument. Pearson
product-moment correlations were obtained for the scores of the
remaining 120 subjects on the RAT and the Watson-Barker test measures
of short term memory, long term memory, and total listening ability
(short term memory plus long term memory). As suggested by the
literature concerning the nature of the relationship between arousal, as
tapped by the RAT instrument, and the retention dimension measured by
listening tests, no significant linear relationships were established for
total listening ability, short term listening, or long term listening
(respectively the results were r = .12, r = .13, r = .06; p > .03).
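
The Pearson product-moment correlation referred to above can be sketched in a few lines. This is only an illustration of the statistic, not the analysis code used in the study: the function name `pearson_r` and the five pairs of scores are hypothetical and were invented to show how scores that rise and then fall can yield a linear correlation near zero.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for five subjects (NOT data from this study).
rat_scores = [30, 35, 40, 45, 50]
listening_scores = [20, 28, 33, 27, 21]

r = pearson_r(rat_scores, listening_scores)
print(f"linear r = {r:.3f}")
```

Because the hypothetical listening scores peak in the middle of the RAT range, the linear coefficient comes out close to zero even though the two variables are clearly related.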

While a certain level of arousal is necessary to perform cognitive tasks
successfully, arousal levels beyond the optimum "readiness" level are
dysfunctional (Cofer and Appley, 1964). As indicated above, previous
research has shown that there is a significant linear correlation between
RAT scores and physiological arousal. A direct relationship between
memory and physiological arousal has been established as well. This
relationship has been shown to be curvilinear in nature, in line with the
"Activation Hypothesis" of Cofer and Appley. Since the Watson-Barker
instrument does claim to measure retention, the relationship between it
and the RAT most probably would not be linear in nature, but rather
would be curvilinear in nature. The further the RAT scores are from the
mean RAT score, the lower the Watson-Barker scores should be.

To test this proposed "inverted U-shaped" relationship, the 120 scores
were arrayed on a scatter diagram and visually examined. This analysis
strongly suggested that the relationship was not linear in nature. To
statistically test this relationship the RAT scores of the 120 subjects were
converted to absolute deviation scores from the mean of the population
(mean = 40.89) and Pearson product-moment correlations were obtained for
the adjusted RAT scores and the Watson-Barker scores of short term
memory, long term memory, and total listening ability (Rosenthal and
Rosnow, 1984, pp. 222-224). Significant relationships were found to exist
between the adjusted RAT scores and long term memory (r = -.20, p < .03) and
between the adjusted RAT scores and total listening ability (r = -.21, p < .02),
but not between the adjusted RAT scores and short term memory (r = -.12,
p < .18). The power of the correlation test was .71 (Cohen, 1977).
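
The adjustment described in this paragraph, replacing each RAT score with its absolute distance from the mean before correlating, can be illustrated with a short sketch. It is a minimal example under stated assumptions: the helper names `pearson_r` and `abs_deviation_from_mean` and the five pairs of scores are hypothetical, chosen to form a clean inverted U, and are not the data or code of this study.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def abs_deviation_from_mean(scores):
    """Fold scores around their mean: distance from the mean, sign dropped."""
    mean = sum(scores) / len(scores)
    return [abs(s - mean) for s in scores]


# Hypothetical inverted-U data (NOT data from this study): listening
# scores are highest at the middle RAT score and fall off on both sides.
rat_scores = [20, 30, 40, 50, 60]
listening_scores = [10, 18, 22, 18, 10]

linear_r = pearson_r(rat_scores, listening_scores)
folded_r = pearson_r(abs_deviation_from_mean(rat_scores), listening_scores)

print(f"raw RAT scores:      r = {linear_r:.2f}")
print(f"adjusted RAT scores: r = {folded_r:.2f}")
```

With these made-up scores the raw correlation is essentially zero, while the correlation with the folded scores is strongly negative. Folding turns a symmetric inverted U into a roughly linear trend that a Pearson coefficient can detect, which is why a negative coefficient on the adjusted scores is read as support for a curvilinear relationship.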

DISCUSSION

The hypothesis was supported with regard to the relationships among
the RAT scores and both long term memory and total listening ability, but
not between short term memory and RAT scores. Previous researchers
have suggested a strong link between arousal and long term retention,
and a relatively weaker link between arousal and short term retention
(Levonian, 1967; Roberts, 1980). These findings are in line with those
results. Taken together with the previous literature on the
arousal-retention relationship, this study provides evidence for the
validity claims of the Watson-Barker Listening Test.

Establishing the validity of any new instrument is difficult. Given the
relatively small portion of variance of listening scores that is accounted
for by the RAT measure, definitive conclusions concerning the validity of
this new instrument must wait for additional data collection. Although
the amount of variance accounted for is small, its magnitude is in line
with Barker's (1984) conceptualization of listening which posits at least
six different subprocesses as being involved with the listening process.
"Recall" is only one of these six processes and the only one to which the
RAT has been empirically linked. It may well be that recall is of less
importance than "attention," "hearing," "understanding," or any of the
other possible subprocesses of listening, insofar as total listening scores
are concerned.
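
The "small portion of variance" can be made concrete by squaring the correlation coefficients, since the square of r is the proportion of variance two measures share. The sketch below uses only the magnitudes of the two significant correlations reported in the results (.20 and .21); the function and variable names are illustrative, not part of the original analysis.

```python
def variance_accounted_for(r):
    """Coefficient of determination: proportion of shared variance, r squared."""
    return r ** 2


# Magnitudes of the two significant correlations reported in the results.
reported_magnitudes = {
    "long term memory": 0.20,
    "total listening ability": 0.21,
}

shared = {
    measure: round(variance_accounted_for(r), 3)
    for measure, r in reported_magnitudes.items()
}

for measure, proportion in shared.items():
    print(f"{measure}: {proportion:.1%} of variance")
```

On these figures the adjusted RAT scores account for roughly 4 percent of the variance in each listening measure, which is consistent with the caution expressed above.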

However, this study does add weight to the claims of external validity
for the Watson-Barker instrument. Further testing of the relationship
between this listening test and measures of "attention," "understanding,"
etc., would help to increase confidence in this procedure. A more direct
test of the relationship between listening scores on the Watson-Barker
test and physiological arousal seems called for as well.

One additional note of caution is called for, based on the research
project outlined above. While many claims of "face validity" have been
made by the designers of listening tests, most of these tests seem, on the
surface, to fail that test of validity because of the single medium nature of
the test stimulus. Listeners generally do not "listen" with just their ears.
Listening typically takes place while the listener is hearing and viewing
the sender of the message. While attempting to assess the listener's
ability to analyze the paralanguage message as well as the verbal message
is indeed a useful pursuit, neglecting to measure the listener's ability to
gain knowledge from the other aspects of nonverbal message
transmission may render the total testing procedure useless in terms of
applying the results to everyday encounters. Efforts are being
undertaken to develop a listening test that more accurately measures the
full range of decoding activities that the typical "listening" task involves.
This new measurement procedure would include both the aural and the
visual stimuli that are present in most communication situations. It is
hoped that this new version of the Watson-Barker Listening Test will be
found to be an even more valid and reliable measure of that nebulous
concept we call listening.






Roberts - page ten 

BIBLIOGRAPHY

Anastasis, A. Psychological Testing. New York: Academic Press, 1961.

Anderson, H. & R. Baldauf. A study of a measure of listening, Journal of
Educational Research, 57 (1963), 197-200.

Barker, L. Communication. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1984.

Beatty, M. Receiver apprehension as a function of cognitive backlog, Western
Journal of Speech Communication, 45 (1981), [pages illegible in source].

Beatty, M. The effects of anticipated listening anxiety on receiver
apprehension scores, Central States Speech Journal, in press.

Beatty, M. & S. Payne. Receiver apprehension and cognitive complexity,
Western Journal of Speech Communication, 45 (1981), 363-369.

Bostrom, R. The Kentucky Comprehensive Listening Test. Lexington: Kentucky
Listening Research Center, 1983.

Bostrom, R. Research Update. Lexington: Kentucky Listening Research Center,
1984.

Cofer, C. & M. Appley. Motivation: Theory and Research. New York: John Wiley
and Sons, 1964.

Cohen, J. Statistical Power Analysis for the Behavioral Sciences. New York:
Academic Press, 1977.

Crane, L., R. Dieker, & C. Brown. Physiological responses to communication,
In Listening: Readings, Vol. 2 (S. Duker, ed.). Metuchen, New
Jersey: Scarecrow Press, 1971.

Cronen, V. & N. Mihevc. Evaluation of deductive argument: A process analysis,
Speech Monographs, 39 (1972), 124-131.

Keller, P. Major findings in listening in the past ten years, Journal of
Communication, 10 (1960), 29-38.

Kelly, C. Actual listening behavior of industrial supervisors, as related
to "listening ability," general mental ability, selected personality
factors and supervisory effectiveness, Unpublished Dissertation,
Purdue University, 1962.




Kelly, C. An investigation of the construct validity of two commercially
published listening tests, Speech Monographs, 32 (1965),
139-143.

Kelly, C. Listening: Complex of activities - and a unitary skill? Speech
Monographs, 34 (1967), 455-465.

Kelly, C. Mental ability and personality factors in listening, Quarterly
Journal of Speech, 49 (1963), 152-156.

Kleinsmith, L. & S. Kaplan. The interaction of arousal and recall interval
in nonsense syllable paired-associate learning, Journal of
Experimental Psychology, 65 (1963), 190-193.

Levonian, E. Retention of information in relation to arousal during
continuously-presented material, American Educational Research
Journal, 4 (1967), 103-116.

Petrie, C. An experimental evaluation of two methods for improving listening
comprehension abilities, Unpublished dissertation, Purdue
University, 1961.

Roberts, C. A physiological validation of the receiver apprehension test,
Communication Research Reports, 1 (1984), 126-129.

Roberts, C. The arousal-retention relationship in communication research.
Paper presented at the Eastern Communication Association
Convention, 1980.

Roberts, C. & T. Steinfatt. Source credibility and physiological arousal: An
important variable in the credibility-information retention
relationship, Southern Speech Communication Journal, 48
(1983), 340-355.

Rosenthal, R. & R. Rosnow. Essentials of Behavioral Research. New York:
McGraw-Hill Book Company, 1984.

Watson, K. & L. Barker. Watson-Barker Listening Test. New Orleans: Spectra,
Inc., 1984.

Wheeless, L. An investigation of receiver apprehension and social context
dimensions of communication apprehension, Speech
Teacher, 24 (1975), 261-268.






